ljyanesm commented Apr 4, 2025

This enables federated identity. There is one caveat the maintainers may want to be aware of: how GOOGLE_APPLICATION_CREDENTIALS is validated.

See https://cloud.google.com/docs/authentication/client-libraries#validate_other_credential_configurations for further details.

Closes #390


Contributor checklist:

  • I have read and understood CONTRIBUTING.md
  • Confirmed an issue exists for the PR, and the text Closes #issue appears in the PR summary (e.g., Closes #123).
  • Confirmed PR is rebased onto the latest base
  • Confirmed failure before change and success after change
  • Any generic new functionality is replicated across cloud providers if necessary
  • Tested manually against live server backend for at least one provider
  • Added tests for any new functionality
  • Linting passes locally
  • Tests pass locally
  • Updated HISTORY.md with the issue that is addressed and the PR you are submitting. If the top section is not `## UNRELEASED`, then you need to add a new section to the top of the document for your change.

…torage client

This enables federated identity

Updates HISTORY.md
ljyanesm requested a review from pjbull April 9, 2025 12:53

pjbull commented Apr 12, 2025

Thanks for the patience @ljyanesm.

I am wondering if this logic works:

        # don't check `GOOGLE_APPLICATION_CREDENTIALS` since `google_default_auth` already does that

        # use explicit client
        if storage_client is not None:
            self.client = storage_client

        # use explicit credentials
        elif credentials is not None:
            self.client = StorageClient(credentials=credentials, project=project)

        # use explicit credential file
        elif application_credentials is not None:
            self.client = StorageClient.from_service_account_json(application_credentials)
        
        # use default credentials based on SDK precedence
        else:
            try:
                # use `google_default_auth` instead of `StorageClient()` since it
                # handles precedence of creds in different locations properly
                credentials, default_project = google_default_auth()
                project = project or default_project  # use explicit project if present
                self.client = StorageClient(credentials=credentials, project=project)            
            except DefaultCredentialsError:
                self.client = StorageClient.create_anonymous_client()

Also, can we update the docstring to describe whatever precedence we end up implementing, with links to the Google docs that explain what the SDK does?

ljyanesm commented

Thanks @pjbull. I just checked: identity federation auth works, and so does key-based auth.

codecov bot commented Apr 16, 2025

Codecov Report

❌ Patch coverage is 50.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.3%. Comparing base (a4d47f9) to head (ae5d37f).
⚠️ Report is 6 commits behind head on 514-local.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| cloudpathlib/gs/gsclient.py | 50.0% | 2 Missing ⚠️ |

❌ Your patch status has failed because the patch coverage (50.0%) is below the target coverage (80.0%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff             @@
##           514-local    #514     +/-   ##
===========================================
- Coverage       93.4%   92.3%   -1.1%     
===========================================
  Files             23      23             
  Lines           1800    1801      +1     
===========================================
- Hits            1682    1664     -18     
- Misses           118     137     +19     

| Files with missing lines | Coverage Δ |
|---|---|
| cloudpathlib/gs/gsclient.py | 90.0% <50.0%> (-1.3%) ⬇️ |

... and 2 files with indirect coverage changes

pjbull commented Apr 16, 2025

@ljyanesm Not sure yet, but first look at failing tests suggests we may need to mock an additional part of the Google SDK to make the mocked tests work.

If you can repro locally and take a look that'd be great.

ljyanesm commented

@pjbull

> @ljyanesm Not sure yet, but first look at failing tests suggests we may need to mock an additional part of the Google SDK to make the mocked tests work.
>
> If you can repro locally and take a look that'd be great.

I tried to repro with the same Python version (3.13), but could not trigger the failure locally.

In fact, locally only tests/test_s3_specific.py::test_aws_endpoint_url_env fails due to a mismatch in the S3 endpoint.

pjbull commented Apr 17, 2025

@ljyanesm New method is too smart!

In order to repro, I needed to:

  • Remove any Google Cloud credentials from my .env and active environment variables
  • Move or rename .gscreds.json in my working directory
  • Run gcloud auth revoke --all to revoke credentials
  • Run gcloud auth application-default revoke to revoke and remove my default credentials

After doing that on macOS with Python 3.13, I get the same error as the test suite. I don't have time to look into a fix quite yet, but maybe this unblocks you to investigate. Thanks!
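
To check that a machine is actually in the clean state described by the steps above, a small helper along these lines might help. This is a sketch under stated assumptions: `find_credential_sources` is a hypothetical name, the Linux/macOS location of the application default credentials file is assumed, and it does not inspect the contents of `.env`, credentials revoked via `gcloud auth revoke`, or metadata-server credentials.

```python
from pathlib import Path
from typing import List, Mapping


def find_credential_sources(env: Mapping[str, str], home: Path, cwd: Path) -> List[str]:
    """List credential sources that are still present (Linux/macOS layout assumed)."""
    found = []
    # Environment variable pointing at a credential file.
    if env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        found.append("env:GOOGLE_APPLICATION_CREDENTIALS")
    # File written by `gcloud auth application-default login`;
    # CLOUDSDK_CONFIG overrides the default gcloud config directory.
    if env.get("CLOUDSDK_CONFIG"):
        config_dir = Path(env["CLOUDSDK_CONFIG"])
    else:
        config_dir = home / ".config" / "gcloud"
    adc_file = config_dir / "application_default_credentials.json"
    if adc_file.is_file():
        found.append(f"file:{adc_file}")
    # Project-local credential file mentioned in the steps above.
    local_file = cwd / ".gscreds.json"
    if local_file.is_file():
        found.append(f"file:{local_file}")
    return found


if __name__ == "__main__":
    import os

    for source in find_credential_sources(os.environ, Path.home(), Path.cwd()):
        print("still present:", source)
```

An empty result suggests the file-based and environment-based sources are gone, so the fallback path should be the one exercised.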

pjbull changed the base branch from master to 514-local August 2, 2025 19:57

pjbull commented Aug 2, 2025

Merging into a local branch to finish this PR

pjbull merged commit e82764b into drivendataorg:514-local Aug 2, 2025
13 of 21 checks passed
pjbull pushed a commit that referenced this pull request Aug 4, 2025
* Refactor authentication to use default credentials for Google Cloud Storage client

This enables federated identity

Updates HISTORY.md

* Keeps same functionality for API whilst enhancing the env var alternative

* Simplifies credential handling logic, and updates docstring

* Updates HISTORY.md

---------

Co-authored-by: ljyanesm <[email protected]>
pjbull added a commit that referenced this pull request Aug 4, 2025
* Refactor GS authentication to use default credentials (#514)

* Refactor authentication to use default credentials for Google Cloud Storage client

This enables federated identity

Updates HISTORY.md

* Keeps same functionality for API whilst enhancing the env var alternative

* Simplifies credential handling logic, and updates docstring

* Updates HISTORY.md

---------

Co-authored-by: ljyanesm <[email protected]>

* bonus: test fixes and docs

* Mock default auth

* fix history

---------

Co-authored-by: Luis Yanes <[email protected]>
Co-authored-by: ljyanesm <[email protected]>
Successfully merging this pull request may close these issues.

GSClient auth fails with token-based application credentials.json